test(conformance): arm the fixtures for 2026-07-28 serving and refresh the expected-failures baseline#2310
Merged
felixweinberger merged 7 commits intoJun 16, 2026
…ture Route modern-classified requests (per-request _meta envelope, server/discover) through a createMcpHandler entry backed by the same fixture server definition; legacy-classified traffic stays on the existing stateful 2025 session path unchanged. Teach the everything client to read MCP_CONFORMANCE_PROTOCOL_VERSION, negotiate the modern era with versionNegotiation on 2026-07-28 runs, and handle the request-metadata scenario.
Expected-failures burn-down (entries now passing, removed):
- request-metadata (client): the server/discover negotiation probe satisfies the SEP-2575 header/_meta/retry-on--32004 checks
- caching (server): 2026-era list/read results now carry ttlMs/cacheScope
- http-custom-header-server-validation (server): every check is SKIPPED because the fixture registers no x-mcp-header-annotated tool; the disputed SEP-2243 error-code cells stay covered by the http-header-validation entry
Comments on the remaining draft-suite entries updated to name what actually blocks them now that the 2026-07-28 path is served (multi-round-trip requests, the disputed error-code cells, error-id echo, removed-method handling).
…rged serving stack Re-ran the burn-down on the integration branch tip. The error-id-echo cells and the enveloped-initialize removed-method cell in server-stateless now pass (error responses echo the request JSON-RPC id; an initialize carrying a valid 2026 envelope is answered 404/-32601), so the entry's note no longer lists them as blockers. No entries are removed: server-stateless is still held by the disputed envelope/header error-code cells pending conformance #336, and every other entry still fails for the reason already recorded.
…26-07-28 The existing legs never pass --spec-version, so carry-forward scenarios were only exercised at their default 2025 version. Add one leg per direction that runs --suite all --spec-version 2026-07-28 against its own expected-failures file (the shared name-keyed baseline cannot express version-split outcomes), and wire both legs into the conformance workflow.
- test:conformance:server:2026 - 54 passed / 21 failed checks; baseline expected-failures.2026-07-28.yaml (16 scenarios: the same failures as the draft-suite leg plus json-schema-2020-12, which fails identically at 2025).
- test:conformance:client:2026 - 206 passed / 37 failed checks; baseline expected-failures.client.2026-07-28.yaml (26 scenarios: tools_call blocked by the referee mocks omitting resultType (fixed upstream, unblocks at the next published conformance release), the SEP-837 application_type check that only fires on draft-version runs, the auth scope-escalation scenarios cut short by the 2026 connection lifecycle, and the scenarios already baselined at 2025).
Both legs fail on unexpected failures and stale baseline entries, same as the existing legs. The referee stays the published 0.2.0-alpha.3 pin.
…ed file The 2025 legs share a single expected-failures.yaml with separate client: and server: sections; mirror that shape for the 2026-07-28 carried-forward legs. Merge expected-failures.client.2026-07-28.yaml into expected-failures.2026-07-28.yaml (client: section added, entries and reasons unchanged), delete the client-specific file, and point test:conformance:client:2026 at the consolidated file. No entry changes.
…pha.4 0.2.0-alpha.4 makes the runner's mock servers include resultType in results, which the SDK's 2026-07-28 client decode requires; this unblocks the carried-forward client scenarios at the 2026 spec version. Lockfile change is scoped to the conformance package.
With the rejection codes aligned to the referee (-32001 for header/body mismatches, -32602 for a missing _meta envelope or protocolVersion key) and the fixture serving the 2026-07-28 path, server-stateless passes fully (21/21 checks) on the draft and 2026 server legs, so its entry leaves both baselines. The 0.2.0-alpha.4 mock servers now include resultType in results, which the SDK 2026 client decode requires, so tools_call passes on the 2026 client leg and leaves that baseline. Also reconcile the shared baseline's header with the new pin (drop the references to the previous release and to auth scenarios that the published release now ships) and restate the http-header-validation reason in the 2026 baseline in terms of the settled codes: the cells still failing are the missing-header and Mcp-Name cross-check ones, not the error-code cells.
…1 assignment The -32001 ladder cell is no longer pending an upstream error-code decision: it is the spec-assigned HeaderMismatch code. The probe classifier still never treats it as modern evidence because deployed servers overload it for session-not-found responses. Wording only; assertions unchanged.
815d374 to 7aec2d4
felixweinberger added a commit that referenced this pull request on Jun 24, 2026
…h the expected-failures baseline (#2310)
Serve the 2026-07-28 protocol revision from the conformance fixture and gate it in CI
Motivation and Context
The conformance suite can already exercise the 2026-07-28 draft revision (draft scenarios
run over the stateless per-request lifecycle, the runner tells client fixtures the resolved
version via MCP_CONFORMANCE_PROTOCOL_VERSION, and --spec-version can force the
carried-forward scenarios onto the new revision), but our fixture only spoke the 2025
stateful lifecycle and our legs never passed --spec-version, so the serving stack now on
v2-2026-07-28 had no referee. This wires the fixture to the new entry point, adds
spec-version-forced legs, and gates everything in CI as a three-way split:
- test:conformance:server (active) and test:conformance:client:all keep the shared
  baseline (expected-failures.yaml) and stay green with no regressions.
- test:conformance:server:draft exercises the draft-only scenarios against the new serving
  path; its remaining failures live in the shared baseline and burn down as the SDK gaps
  close.
- test:conformance:server:2026 and test:conformance:client:2026 run
  --suite all --spec-version 2026-07-28 so the carried-forward scenarios are also exercised
  at the new revision. Both are gated against one new shared baseline
  (expected-failures.2026-07-28.yaml, client:/server: sections like the existing file)
  because the name-keyed shared baseline cannot express version-split outcomes.
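To make the baseline gating concrete, here is a minimal sketch of a name-keyed expected-failures gate. It is not the conformance runner's implementation: the `Baseline` shape, the `gate` function, and the scenario names are all hypothetical. It only illustrates why one name-keyed list cannot be correct for two spec versions when the same scenario passes at one version and fails at the other.

```typescript
type Outcome = 'pass' | 'fail';

// Hypothetical shape: a baseline section is a list of scenario names expected to fail.
type Baseline = string[];

interface GateResult {
  unexpectedFailures: string[]; // failed, but not listed in the baseline
  staleEntries: string[];       // listed in the baseline, but did not fail
  exitCode: 0 | 1;
}

// Assumed gate rule (taken from the description above): a leg fails on
// unexpected failures AND on stale baseline entries.
function gate(results: Record<string, Outcome>, baseline: Baseline): GateResult {
  const unexpectedFailures = Object.keys(results)
    .filter((name) => results[name] === 'fail' && baseline.indexOf(name) === -1)
    .sort();
  // In this sketch, an entry for a scenario that did not run also counts as stale.
  const staleEntries = baseline
    .filter((name) => results[name] !== 'fail')
    .sort();
  const exitCode = unexpectedFailures.length + staleEntries.length > 0 ? 1 : 0;
  return { unexpectedFailures, staleEntries, exitCode };
}

// Illustrative outcomes only (hypothetical scenario names, not recorded results):
// scenario-a passes at the 2025 version and fails when forced to 2026-07-28.
const at2025: Record<string, Outcome> = { 'scenario-a': 'pass', 'scenario-b': 'fail' };
const at2026: Record<string, Outcome> = { 'scenario-a': 'fail', 'scenario-b': 'fail' };

const sharedBaseline: Baseline = ['scenario-b'];
const baseline2026: Baseline = ['scenario-a', 'scenario-b'];

const sharedAt2025 = gate(at2025, sharedBaseline); // exitCode 0
const sharedAt2026 = gate(at2026, sharedBaseline); // exitCode 1: scenario-a is an unexpected failure
const splitAt2025 = gate(at2025, baseline2026);    // exitCode 1: scenario-a is a stale entry
const splitAt2026 = gate(at2026, baseline2026);    // exitCode 0
```

Whichever way scenario-a is recorded in a single list, one of the two versions trips the gate, which is the situation a separate per-version baseline file avoids.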
This also bumps the pinned @modelcontextprotocol/conformance release from 0.2.0-alpha.3 to
0.2.0-alpha.4: its mock servers now include resultType in results, which the SDK's
2026-07-28 client decode requires, so the carried-forward client scenarios can complete at
the new revision.
What changed
- everythingServer.ts: requests claiming the per-request _meta envelope (including
  server/discover and malformed claims) are served through createMcpHandler, backed by
  the same fixture server definition; everything else stays on the existing stateful session
  path unchanged.
- everythingClient.ts: reads MCP_CONFORMANCE_PROTOCOL_VERSION; tools_call speaks the
  2026 lifecycle on 2026-07-28 runs (version negotiation + per-request _meta); new
  request-metadata handler driven by versionNegotiation: { mode: 'auto' }.
- test:conformance:server:2026 and test:conformance:client:2026
  (--suite all --spec-version 2026-07-28), plus one new step in each of the two jobs in
  .github/workflows/conformance.yml to run them.
- expected-failures.2026-07-28.yaml shared by the two new legs, every entry with a reason
  comment, grouped by what unblocks it (15 server entries, 25 client entries).
- expected-failures.yaml: request-metadata, caching,
  http-custom-header-server-validation, and server-stateless now pass and are removed;
  the header and the remaining entries' comments are updated to the current pin and the
  settled -32001/-32602 rejection codes.
- @modelcontextprotocol/conformance pin 0.2.0-alpha.3 → 0.2.0-alpha.4 (lockfile change
  scoped to that package).
- Wording only: the -32001 assignment is no longer described as pending an upstream
  decision (assertions unchanged).
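The routing split and the settled rejection codes can be sketched as follows. This is an illustration under stated assumptions, not the fixture's code: `classifyEra`, `rejectionCode`, and the request type are hypothetical names, and placing the protocolVersion key inside `_meta` and reading the method from a lower-cased `mcp-method` header are assumptions. Only the two codes and the conditions they map to come from the description above.

```typescript
// Hypothetical request shape; the real envelope is defined by the SDK and the spec.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id?: string | number;
  method: string;
  params?: { _meta?: unknown; [key: string]: unknown };
}

type Era = 'modern' | 'legacy';

const HEADER_MISMATCH = -32001; // header/body mismatch
const INVALID_PARAMS = -32602;  // missing _meta envelope or protocolVersion key

// Assumed rule: a request is routed to the modern entry if it claims a
// per-request _meta envelope (even a malformed one) or calls server/discover.
function classifyEra(req: JsonRpcRequest): Era {
  const claimsEnvelope = req.params !== undefined && '_meta' in req.params;
  return claimsEnvelope || req.method === 'server/discover' ? 'modern' : 'legacy';
}

// Returns the JSON-RPC error code for a modern-classified request, or null if
// none of the checks in this sketch reject it. Missing headers are deliberately
// not handled here.
function rejectionCode(
  req: JsonRpcRequest,
  headers: Record<string, string | undefined>
): number | null {
  const meta = req.params?._meta;
  // Assumption: the protocolVersion key lives inside the _meta envelope.
  if (typeof meta !== 'object' || meta === null || !('protocolVersion' in meta)) {
    return INVALID_PARAMS;
  }
  // Assumption: header names arrive lower-cased, as Node's HTTP layer provides them.
  const headerMethod = headers['mcp-method'];
  if (headerMethod !== undefined && headerMethod !== req.method) {
    return HEADER_MISMATCH;
  }
  return null;
}
```

The order of the two checks (envelope first, then header cross-check) is a choice made for this sketch; the source does not say which rejection wins when both conditions hold.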
How Has This Been Tested?
- test:conformance:server 42/0 (unchanged), test:conformance:server:draft 6→39 passed
  checks (14 expected-failure scenarios remain), test:conformance:client:all 317→324 passed
  checks (15 remain), and the new legs test:conformance:server:2026 63 passed / 18 failed
  checks and test:conformance:client:2026 207 passed / 36 failed checks, with every failure
  covered by the new baseline.
- Unexpected failures and stale baseline entries each make the leg exit non-zero (observed
  directly while burning down the baseline entries that started passing).
- pnpm -r typecheck, lint:all, docs:check, and the client package test suite pass;
  pnpm install --frozen-lockfile succeeds against the updated lockfile.
Breaking Changes
None — test fixture, CI wiring, baselines, and a dev-dependency pin bump in a private
package.
Types of changes
Checklist
Additional context
Remaining expected-failure groups, by what unblocks them:
- input-required-result-* family (server, in both the draft suite and the 2026
  carried-forward leg) and sep-2322-client-request-state.
- http-header-validation reject cells (missing Mcp-Method/Mcp-Name headers and the
  Mcp-Name cross-check) and the client header scenarios.
- SEP-2106 $ref handling: same entries as before, mirrored into the 2026 client section
  where those scenarios also run.
- application_type during DCR plus the 2026 lifecycle in the fixture's auth flow: the auth
  entries that only fail on the forced-2026 client leg.
- json-schema-2020-12 (server 2026 leg only): pre-existing fixture/baseline issue that
  fails identically at 2025 in --suite all; not a 2026-path regression.

http-custom-header-server-validation passes with all checks skipped (the fixture has no
custom-header-annotated tool yet); the SEP-2243 reject cells remain covered by the
http-header-validation entry.
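
The "passes with all checks skipped" case is worth spelling out, since such a pass carries no evidence about the behavior under test. The sketch below assumes a scenario passes when no check fails; that rule is inferred from the outcome described above, not taken from the runner. The `CheckStatus` values other than SKIPPED and both function names are hypothetical.

```typescript
// SKIPPED comes from the description above; the other status names are assumptions.
type CheckStatus = 'SUCCESS' | 'FAILURE' | 'SKIPPED';

// Assumed verdict rule: a scenario passes when none of its checks failed.
function scenarioPasses(checks: CheckStatus[]): boolean {
  return checks.every((status) => status !== 'FAILURE');
}

// A pass is vacuous when every check was skipped (or there were no checks),
// so nothing was actually verified.
function isVacuousPass(checks: CheckStatus[]): boolean {
  return scenarioPasses(checks) && checks.every((status) => status === 'SKIPPED');
}

const allSkipped: CheckStatus[] = ['SKIPPED', 'SKIPPED', 'SKIPPED'];
const mixed: CheckStatus[] = ['SUCCESS', 'SKIPPED'];

const skippedVerdict = scenarioPasses(allSkipped); // true
const skippedIsVacuous = isVacuousPass(allSkipped); // true
const mixedIsVacuous = isVacuousPass(mixed);        // false
```

Under this rule the entry can leave the baseline today, while the reject cells it would otherwise exercise stay tracked by the http-header-validation entry.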